AITopics | harmonic mean

Collaborating Authors

harmonic mean

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning

Wang, Ru, Huang, Wei, Cao, Qi, Iwasawa, Yusuke, Matsuo, Yutaka, Guo, Jiaxian

arXiv.org Artificial IntelligenceNov-4-2025

Test-time reinforcement learning (TTRL) offers a label-free paradigm for adapting models using only synthetic signals at inference, but its success hinges on constructing reliable learning signals. Standard approaches such as majority voting often collapse to spurious yet popular answers. We introduce Self-Harmony, a framework built on a simple intuition: the correct answer should remain stable across both an original question and its paraphrase. Self-Harmony operationalizes this by employing a single model in two complementary roles: a Solver to produce answers and a Reframer to rephrase the input. Based on this, we further propose a pseudo-label method: instead of majority voting, it aggregates answer frequencies across these original and reframed views using the harmonic mean. This is a process that naturally selects for solutions stable under reframing, thereby avoiding the common trap of favoring view-dependent, spurious answers. Crucially, this requires no human supervision or auxiliary models. Across diverse reasoning benchmarks, Self-Harmony achieves state-of-the-art results at the label-free test-time setting, ranking first in 28 of 30 settings across multiple methods. Beyond accuracy, it demonstrates unprecedented robustness, with zero training failures in all experiments, underscoring its stability and reliability.

large language model, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2511.01191

Country:

North America > Mexico (0.28)
Asia > Middle East (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Add feedback

Automatic Prompt Generation via Adaptive Selection of Prompting Techniques

Ikenoue, Yohei, Tashiro, Hitomi, Kuroyanagi, Shigeru

arXiv.org Artificial IntelligenceOct-22-2025

Prompt engineering is crucial for achieving reliable and effective outputs from large language models (LLMs), but its design requires specialized knowledge of prompting techniques and a deep understanding of target tasks. To address this challenge, we propose a novel method that adaptively selects task-appropriate prompting techniques based on users' abstract task descriptions and automatically generates high-quality prompts without relying on pre-existing templates or frameworks. The proposed method constructs a knowledge base that associates task clusters, characterized by semantic similarity across diverse tasks, with their corresponding prompting techniques. When users input task descriptions, the system assigns them to the most relevant task cluster and dynamically generates prompts by integrating techniques drawn from the knowledge base. An experimental evaluation of the proposed method on 23 tasks from BIG-Bench Extra Hard (BBEH) demonstrates superior performance compared with standard prompts and existing automatic prompt-generation tools, as measured by both arithmetic and harmonic mean scores. This research establishes a foundation for streamlining and standardizing prompt creation, enabling non-experts to effectively leverage LLMs.

large language model, natural language, temperature parameter analysis, (18 more...)

arXiv.org Artificial Intelligence

2510.18162

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

SimulatorArena: Are User Simulators Reliable Proxies for Multi-Turn Evaluation of AI Assistants?

Dou, Yao, Galley, Michel, Peng, Baolin, Kedzie, Chris, Cai, Weixin, Ritter, Alan, Quirk, Chris, Xu, Wei, Gao, Jianfeng

arXiv.org Artificial IntelligenceOct-10-2025

Large language models (LLMs) are increasingly used in interactive applications, and human evaluation remains the gold standard for assessing their performance in multi-turn conversations. Since human studies are costly, time-consuming, and hard to reproduce, recent work explores using LLMs to simulate users for automatic assistant evaluation. However, there is no benchmark or systematic study to evaluate whether these simulated users are reliable stand-ins for real users. To address this, we introduce SimulatorArena, a benchmark of 909 annotated human-LLM conversations on two interactive tasks -- math tutoring and document creation. SimulatorArena evaluates simulators based on how closely their messages match human behavior and how well their assistant ratings align with human judgments. Experiments on various simulator methods show that simulators conditioned on user profiles, capturing traits like background and message styles, align closely with human judgments. They reach Spearman's $ρ$ of 0.7 on both tasks, providing a practical, scalable alternative to human evaluation. Using the best simulator for each task, we benchmark 18 assistants, including the latest LLMs such as GPT-5, Claude 4.1 Opus, and Gemini 2.5 Pro.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.05444

Country: North America > United States (0.27)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Network-Based Detection of Autism Spectrum Disorder Using Sustainable and Non-invasive Salivary Biomarkers

Fernandes, Janayna M., Sabino-Silva, Robinson, Carneiro, Murillo G.

arXiv.org Artificial IntelligenceSep-22-2025

Autism Spectrum Disorder (ASD) lacks reliable biological markers, delaying early diagnosis. Using 159 salivary samples analyzed by ATR-FTIR spectroscopy, we developed GANet, a genetic algorithm-based network optimization framework leveraging PageRank and Degree for importance-based feature characterization. GANet systematically optimizes network structure to extract meaningful patterns from high-dimensional spectral data. It achieved superior performance compared to linear discriminant analysis, support vector machines, and deep learning models, reaching 0.78 accuracy, 0.61 sensitivity, 0.90 specificity, and a 0.74 harmonic mean. These results demonstrate GANet's potential as a robust, bio-inspired, non-invasive tool for precise ASD detection and broader spectral-based health applications.

classification, evolutionary algorithm, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.16126

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

SwiftF0: Fast and Accurate Monophonic Pitch Detection

Nieradzik, Lars

arXiv.org Artificial IntelligenceAug-27-2025

Accurate and real-time monophonic pitch estimation in noisy conditions, particularly on resource-constrained devices, remains an open challenge in audio processing. We present \emph{SwiftF0}, a novel, lightweight neural model that sets a new state-of-the-art for monophonic pitch estimation. Through training on diverse speech, music, and synthetic datasets with extensive data augmentation, SwiftF0 achieves robust generalization across acoustic domains while maintaining computational efficiency. SwiftF0 achieves a 91.80\% harmonic mean (HM) at 10 dB SNR, outperforming baselines like CREPE by over 12 percentage points and degrading by only 2.3 points from clean audio. SwiftF0 requires only 95,842 parameters and runs approximately 42x faster than CREPE on CPU, making it ideal for efficient, real-time deployment. To address the critical lack of perfectly accurate ground truth pitch in speech corpora (which typically rely on algorithmic estimators or laryngograph signals), we introduce \emph{SpeechSynth}. This synthetic speech dataset, generated by a phoneme-level TTS model, provides exact, on-demand ground-truth pitch curves, enabling more robust model training and evaluation. Furthermore, we propose a unified metric, combining six complementary performance measures for comprehensive and reliable pitch evaluation, and release an open-source pitch benchmark suite. A live demo of SwiftF0 is available at https://swift-f0.github.io/, the source code at https://github.com/lars76/swift-f0, and the benchmark framework at https://github.com/lars76/pitch-benchmark.

machine learning, natural language, swiftf0, (20 more...)

arXiv.org Artificial Intelligence

2508.1844

Genre: Research Report > New Finding (0.46)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Speech (0.93)
Information Technology > Architecture (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

8b0d268963dd0cfb808aac48a549829f-Supplemental.pdf

Neural Information Processing SystemsAug-15-2025, 18:29:14 GMT

artificial intelligence, class feature, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.34)

Add feedback

Shapley-Inspired Feature Weighting in $k$-means with No Additional Hyperparameters

Fawley, Richard J., de Amorim, Renato Cordeiro

arXiv.org Artificial IntelligenceAug-12-2025

Clustering algorithms often assume all features contribute equally to the data structure, an assumption that usually fails in high-dimensional or noisy settings. Feature weighting methods can address this, but most require additional parameter tuning. We propose SHARK (Shapley Reweighted $k$-means), a feature-weighted clustering algorithm motivated by the use of Shapley values from cooperative game theory to quantify feature relevance, which requires no additional parameters beyond those in $k$-means. We prove that the $k$-means objective can be decomposed into a sum of per-feature Shapley values, providing an axiomatic foundation for unsupervised feature relevance and reducing Shapley computation from exponential to polynomial time. SHARK iteratively re-weights features by the inverse of their Shapley contribution, emphasising informative dimensions and down-weighting irrelevant ones. Experiments on synthetic and real-world data sets show that SHARK consistently matches or outperforms existing methods, achieving superior robustness and accuracy, particularly in scenarios where noise may be present. Software: https://github.com/rickfawley/shark.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.07952

Country: Europe > Austria (0.28)

Genre:

Research Report (0.82)
Overview (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Flood-DamageSense: Multimodal Mamba with Multitask Learning for Building Flood Damage Assessment using SAR Remote Sensing Imagery

Ho, Yu-Hsuan, Mostafavi, Ali

arXiv.org Artificial IntelligenceJun-10-2025

Most post-disaster damage classifiers succeed only when destructive forces leave clear spectral or structural signatures -- conditions rarely present after inundation. Consequently, existing models perform poorly at identifying flood-related building damages. The model presented in this study, Flood-DamageSense, addresses this gap as the first deep-learning framework purpose-built for building-level flood-damage assessment. The architecture fuses pre- and post-event SAR/InSAR scenes with very-high-resolution optical basemaps and an inherent flood-risk layer that encodes long-term exposure probabilities, guiding the network toward plausibly affected structures even when compositional change is minimal. A multimodal Mamba backbone with a semi-Siamese encoder and task-specific decoders jointly predicts (1) graded building-damage states, (2) floodwater extent, and (3) building footprints. Training and evaluation on Hurricane Harvey (2017) imagery from Harris County, Texas -- supported by insurance-derived property-damage extents -- show a mean F1 improvement of up to 19 percentage points over state-of-the-art baselines, with the largest gains in the frequently misclassified "minor" and "moderate" damage categories. Ablation studies identify the inherent-risk feature as the single most significant contributor to this performance boost. An end-to-end post-processing pipeline converts pixel-level outputs to actionable, building-scale damage maps within minutes of image acquisition. By combining risk-aware modeling with SAR's all-weather capability, Flood-DamageSense delivers faster, finer-grained, and more reliable flood-damage intelligence to support post-disaster decision-making and resource allocation.

artificial intelligence, damage assessment, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2506.06667

Country: North America > United States > Texas > Harris County (0.24)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.51)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Feasibility with Language Models for Open-World Compositional Zero-Shot Learning

Kim, Jae Myung, Alaniz, Stephan, Schmid, Cordelia, Akata, Zeynep

arXiv.org Artificial IntelligenceMay-19-2025

Humans can easily tell if an attribute (also called state) is realistic, i.e., feasible, for an object, e.g. fire can be hot, but it cannot be wet. In Open-World Compositional Zero-Shot Learning, when all possible state-object combinations are considered as unseen classes, zero-shot predictors tend to perform poorly. Our work focuses on using external auxiliary knowledge to determine the feasibility of state-object combinations. Our Feasibility with Language Model (FLM) is a simple and effective approach that leverages Large Language Models (LLMs) to better comprehend the semantic relationships between states and objects. FLM involves querying an LLM about the feasibility of a given pair and retrieving the output logit for the positive answer. To mitigate potential misguidance of the LLM given that many of the state-object compositions are rare or completely infeasible, we observe that the in-context learning ability of LLMs is essential. We present an extensive study identifying Vicuna and ChatGPT as best performing, and we demonstrate that our FLM consistently improves OW-CZSL performance across all three benchmarks.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2505.11181

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Bayesian Principles Improve Prompt Learning In Vision-Language Models

Kim, Mingyu, Ko, Jongwoo, Park, Mijung

arXiv.org Artificial IntelligenceApr-22-2025

Prompt learning is a popular fine-tuning method for vision-language models due to its efficiency. It requires a small number of additional learnable parameters while significantly enhancing performance on target tasks. However, most existing methods suffer from overfitting to fine-tuning data, yielding poor generalizability. To address this, we propose a new training objective function based on a Bayesian learning principle to balance adaptability and generalizability. We derive a prior over the logits, where the mean function is parameterized by the pre-trained model, while the posterior corresponds to the fine-tuned model. This objective establishes a balance by allowing the fine-tuned model to adapt to downstream tasks while remaining close to the pre-trained model.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2504.14123

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback